The number of neutral mutants in an expanding Luria-Delbrück population is approximately Fréchet

Background: A growing population of cells accumulates mutations. A single mutation early in the growth process carries forward to all descendant cells, causing the final population to have a lot of mutant cells. When the first mutation happens later in growth, the final population typically has fewer mutants. The number of mutant cells in the final population follows the Luria-Delbrück distribution. The mathematical form of the distribution is known only from its probability generating function. For larger populations of cells, one typically uses computer simulations to estimate the distribution.
Methods: This article searches for a simple approximation of the Luria-Delbrück distribution, with an explicit mathematical form that can be used easily in calculations.
Results: The Fréchet distribution provides a good approximation for the Luria-Delbrück distribution for neutral mutations, which do not cause a growth rate change relative to the original cells.
Conclusions: The Fréchet distribution apparently provides a good match through its description of extreme value problems for multiplicative processes such as exponential growth.

The distribution of the number of mutants, m, is known as the Luria-Delbrück distribution 1. That distribution is widely used to estimate the mutation rate. The distribution also arises when studying the amount of mutational mosaicism within multicellular individuals 2-4.
Currently, for experiments with a small number of mutational events, one typically calculates the distribution with a probability generating function 5,6. However, that approach becomes numerically inaccurate for larger numbers of mutational events, in which case the distribution is calculated by computer simulation.
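To make the simulation approach concrete, the following is a minimal sketch of one way to simulate the final number of neutral mutants. It is not the code used for this article. It assumes synchronous doubling from a single wild-type founder, assumes that each wild-type division yields one mutant daughter with probability u, and uses a Poisson approximation for the number of new mutations per generation. The function names `poisson` and `simulate_mutants` are hypothetical. Knuth's Poisson sampler is used only because it needs no external library; it is suitable for small means such as those in this example.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method (small lam only)."""
    if lam <= 0.0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_mutants(generations, u, rng):
    """Return the final number of mutants after `generations` synchronous doublings.

    Illustrative assumptions (not taken from the article): one wild-type founder,
    every cell divides once per generation, each wild-type division yields one
    mutant daughter with probability u, and mutants divide at the same rate
    as wild-type cells (neutral mutations).
    """
    wild, mutants = 1, 0
    for _ in range(generations):
        # New mutational events among the dividing wild-type cells.
        new = min(poisson(rng, wild * u), wild)
        # Every cell doubles; each new mutation converts one wild-type daughter.
        wild, mutants = 2 * wild - new, 2 * mutants + new
    return mutants


rng = random.Random(1)
generations, u = 20, 1e-5  # final size N = 2**20, so N*u is roughly 10
samples = [simulate_mutants(generations, u, rng) for _ in range(2000)]
z = 50
fraction = sum(m <= z for m in samples) / len(samples)
print(fraction)  # empirical estimate of P(m <= z)
```

Repeating the simulation many times and tabulating the fraction of runs with m ≤ z gives an empirical cumulative distribution, which is the quantity that a closed-form approximation would replace.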
This article shows that the Fréchet distribution provides a good approximation for the number of neutral mutants. In particular, the probability that the number of mutants, m, is at most z is approximately
$$P(m \le z) \approx F(z) = \exp\left[-\left(\frac{z-\beta}{s}\right)^{-\alpha}\right], \tag{1}$$
in which exp(z) = e^z is the exponential function. The probability of being in the upper tail, m > z, is 1 − F(z). The three parameters set the shape, α, the scale, s, and the minimum value, β, such that z, m > β.
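As an illustration of how an explicit form can be used in calculations, the sketch below evaluates the standard three-parameter Fréchet cumulative distribution function, F(z) = exp(−((z − β)/s)^(−α)), together with its upper tail and its inverse. The function names are hypothetical, and the parameter values in the example are arbitrary; they are not the parameterization proposed in this article.

```python
import math


def _power_term(z, alpha, s, beta):
    """Return ((z - beta) / s) ** (-alpha), capped to avoid overflow just above beta."""
    log_term = -alpha * math.log((z - beta) / s)
    return math.exp(min(log_term, 700.0))


def frechet_cdf(z, alpha, s, beta):
    """F(z) = exp(-((z - beta) / s) ** (-alpha)) for z > beta, and 0 otherwise."""
    if z <= beta:
        return 0.0
    return math.exp(-_power_term(z, alpha, s, beta))


def frechet_upper_tail(z, alpha, s, beta):
    """Upper tail 1 - F(z); expm1 keeps precision when the tail is small."""
    if z <= beta:
        return 1.0
    return -math.expm1(-_power_term(z, alpha, s, beta))


def frechet_quantile(p, alpha, s, beta):
    """Inverse of F: the value z at which F(z) = p, for 0 < p < 1."""
    return beta + s * (-math.log(p)) ** (-1.0 / alpha)


# Arbitrary illustrative parameters (shape, scale, minimum).
alpha, s, beta = 1.5, 2.0, 1.0
median = frechet_quantile(0.5, alpha, s, beta)
print(median, frechet_cdf(median, alpha, s, beta), frechet_upper_tail(median, alpha, s, beta))
```

The quantile function follows from solving F(z) = p for z, which gives z = β + s(−log p)^(−1/α). It is convenient for reading off medians or tail thresholds without simulation.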
This form of the Fréchet distribution has three parameters. I found that the following parameterization matches closely the Luria-Delbrück process for neutral mutations ( ) in which e is the base of the natural logarithm. This parameterization depends on the single parameter, Nu, the final population size times the mutation rate. Figure 1 shows the good fit. Two aspects of mismatch occur. First, the number of mutants is discrete, whereas the Fréchet is continuous. As Nu declines to one, significant amounts of probability mass concentrate at particular mutant number values, causing discrepancy between the distributions. Nonetheless, the Fréchet remains a good approximation.
Second, the lower tail of the Luria-Delbrück process spreads to lower values than the Fréchet. One can see this mismatch most clearly in the figure for Nu ≥ 100.
This mismatch may occur because the Luria-Delbrück process transitions from a highly stochastic process in earlier cellular generations to a nearly deterministic accumulation of mutations in later cellular generations, when the larger population size reduces the coefficient of variation in the number of new mutations. The Fréchet applies most closely to the earlier generations for the following reasons.
[Amendments from Version 1: In Equation 1, I replaced m < z with m ≤ z.]
In an expanding population, the earliest mutation strongly influences the final number of mutants. An early mutant carries forward to all descendant cells in an expanding mutant clone. If we start with the final cells and then look back through the cellular generations toward the original progenitor, the mutation with the most extreme time from the end toward the beginning tends to dominate the final mutant number.
The extreme value of a temporal extent often has a Gumbel distribution. In this case, once the mutation arises, it increases multiplicatively by cell division to affect the final mutation count. Substituting the extreme Gumbel time for its multiplicative consequence provides a common way to observe a Fréchet probability pattern.
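The link between a Gumbel time and a Fréchet size can be checked numerically. The sketch below rests on assumptions that are not taken from the article: the extreme time T follows a Gumbel (maximum) distribution with location μ and scale b, and a clone that arose a time T before the end grows to size X = exp(rT) for a growth rate r. Under those assumptions, P(X ≤ x) = P(T ≤ log(x)/r) = exp(−(x/e^(rμ))^(−1/(rb))), which is a Fréchet distribution with shape 1/(rb), scale e^(rμ), and minimum 0. The symbols μ, b, and r and all function names are hypothetical, and the parameter values are arbitrary.

```python
import math
import random


def gumbel_cdf(t, mu, b):
    """Gumbel (maximum) CDF with location mu and scale b."""
    return math.exp(-math.exp(-(t - mu) / b))


def gumbel_sample(rng, mu, b):
    """Inverse-transform draw from the Gumbel (maximum) distribution."""
    u = rng.random()
    while u <= 0.0:  # rng.random() lies in [0, 1); avoid log(0)
        u = rng.random()
    return mu - b * math.log(-math.log(u))


def frechet_cdf(x, alpha, s):
    """Two-parameter Frechet CDF with minimum 0."""
    return math.exp(-((x / s) ** (-alpha))) if x > 0 else 0.0


rng = random.Random(0)
mu, b, r = 2.0, 0.5, 1.0  # arbitrary illustrative values
n = 20000

# Exponentiate Gumbel times to get multiplicative clone sizes.
sizes = sorted(math.exp(r * gumbel_sample(rng, mu, b)) for _ in range(n))

# Frechet parameters implied by the change of variables X = exp(r * T).
alpha, s = 1.0 / (r * b), math.exp(r * mu)

# Largest gap between the empirical CDF of the sizes and the Frechet CDF.
max_gap = max(abs((i + 1) / n - frechet_cdf(x, alpha, s)) for i, x in enumerate(sizes))
print(max_gap)
```

A small maximum gap indicates that exponentiating Gumbel-distributed times reproduces the Fréchet form. The check covers only this change of variables and says nothing about how closely the full Luria-Delbrück process follows either distribution.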
Prior mathematical work also supports the Fréchet approximation. Kessler and Levine 10 showed that the Luria-Delbrück distribution converges to a Landau distribution for large Nu, in which the Landau distribution is a special case of the Lévy α-stable distribution. However, the Landau distribution does not have a closed-form expression for its probability or cumulative distribution functions.
Separately, Simon 11 showed the close match between the Lévy α-stable distribution and the Fréchet distribution. That match of a Lévy distribution to the Fréchet distribution had not previously been associated with the Luria-Delbrück distribution. The Fréchet parameterization in this article provides a simple expression that can be used to develop further theory and applications of the Luria-Delbrück process.

Software availability
The Julia software code used to produce Figure 1:

Open Peer Review
distribution, which microbiologists use to help determine microbial mutation rates in the laboratory. Specifically, equation (1) in the brief report is an approximation of the cumulative probability. If denotes the probability of mutants, the author implicitly defines the cumulative probability as .
The author's key finding is that , where is defined by equation (1) in the brief report. Note that the approximation in (1) is valid for any . However, as pointed out by the author, the approximation works well only for values of that are noticeably larger than . I have conducted a number of computer experiments and confirmed the numerical results in the brief report. The approximation is theoretically interesting, and it may stimulate further theoretical developments. Thus, the paper merits indexing.
I have a minor comment. There appears to be a typo in equation (1)  importantly, this may make the approximation more accurate for small . Consider the case (The symbol here is the same as the symbol in the brief report). Table 1 shows results obtained by using the revised definition, while Table 1A shows corresponding results obtained by using the original definition. In both tables, "error" refers to the following quantity:

If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response (F1000Research Advisory Board Member) 20 Feb 2023

Steven Frank
Thank you for the careful reading. With regard to the comment about m < z versus m ≤ z, the calculations to make Figure 1 used m ≤ z for the empirical distribution, as recommended by the reviewer. For the theoretical continuous Fréchet, the numerical values are the same for the two cases. However, I agree that the notation in the original version of the manuscript is misleading. I will post a revised version that uses m ≤ z, as recommended.
Competing Interests: No competing interests were disclosed.